AITopics

Country: North America > United States (0.29)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Neural Information Processing SystemsFeb-17-2026, 17:13:28 GMT

e75dce944052276caf89c17aca8963d3-Paper-Conference.pdf

artificial intelligence, chick, machine learning, (15 more...)

Country:

North America > United States > Indiana (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Neural Information Processing SystemsFeb-11-2026, 15:01:40 GMT

e360367584297ee8d2d5afa709cd440e-Paper.pdf

ann map, correlation, experiment, (16 more...)

Country:

North America > United States (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial IntelligenceDec-5-2025

Robust HRRP Recognition under Interrupted Sampling Repeater Jamming using a Prior Jamming Information-Guided Network

Sun, Guozheng, Wang, Lei, Wang, Yanhao, Wang, Jie, Liu, Yimin

Radar automatic target recognition (RATR) based on high-resolution range profile (HRRP) has attracted increasing attention due to its ability to capture fine-grained structural features. However, recognizing targets under electronic countermeasures (ECM), especially the mainstream interrupted-sampling repeater jamming (ISRJ), remains a significant challenge, as HRRPs often suffer from serious feature distortion. To address this, we propose a robust HRRP recognition method guided by prior jamming information. Specifically, we introduce a point spread function (PSF) as prior information to model the HRRP distortion induced by ISRJ. Based on this, we design a recognition network that leverages this prior through a prior-guided feature interaction module and a hybrid loss function to enhance the model's discriminative capability. With the aid of prior information, the model can learn invariant features within distorted HRRP under different jamming parameters. Both the simulated and measured-data experiments demonstrate that our method consistently outperforms state-of-the-art approaches and exhibits stronger generalization capabilities when facing unseen jamming parameters.

artificial intelligence, information, machine learning, (15 more...)

2511.23256

Genre: Research Report (0.84)

Industry:

Transportation > Passenger (0.68)
Transportation > Air (0.68)
Aerospace & Defense > Aircraft (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Nguyen, Minh Hoang, Thiet, Su Nguyen

Enhancing OCR for Sino-Vietnamese Language Processing via Fine-tuned PaddleOCRv5

arXiv.org Artificial IntelligenceDec-2-2025

Recognizing and processing Classical Chinese (Han-Nom) texts play a vital role in digitizing Vietnamese historical documents and enabling cross-lingual semantic research. However, existing OCR systems struggle with degraded scans, non-standard glyphs, and handwriting variations common in ancient sources. In this work, we propose a fine-tuning approach for PaddleOCRv5 to improve character recognition on Han-Nom texts. We retrain the text recognition module using a curated subset of ancient Vietnamese Chinese manuscripts, supported by a full training pipeline covering preprocessing, LMDB conversion, evaluation, and visualization. Experimental results show a significant improvement over the base model, with exact accuracy increasing from 37.5 percent to 50.0 percent, particularly under noisy image conditions. Furthermore, we develop an interactive demo that visually compares pre- and post-fine-tuning recognition results, facilitating downstream applications such as Han-Vietnamese semantic alignment, machine translation, and historical linguistics research. The demo is available at https://huggingface.co/spaces/MinhDS/Fine-tuned-PaddleOCRv5

machine learning, natural language, pattern recognition, (18 more...)

2510.04003

Country: Asia > Vietnam (0.30)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.73)

arXiv.org Artificial IntelligenceNov-13-2025

Context-Aware Dynamic Chunking for Streaming Tibetan Speech Recognition

Wang, Chao, Cai, Yuqing, Duojie, Renzeng, Zhang, Jin, Liu, Yutong, Tashi, Nyima

ABSTRACT In this work, we propose a streaming speech recognition framework for Amdo Tibetan, built upon a hybrid CTC/Atten-tion architecture with a context-aware dynamic chunking mechanism. The proposed strategy adaptively adjusts chunk widths based on encoding states, enabling flexible receptive fields, cross-chunk information exchange, and robust adaptation to varying speaking rates, thereby alleviating the context truncation problem of fixed-chunk methods. To further capture the linguistic characteristics of Tibetan, we construct a lexicon grounded in its orthographic principles, providing linguistically motivated modeling units. During decoding, an external language model is integrated to enhance semantic consistency and improve recognition of long sentences. Experimental results show that the proposed framework achieves a word error rate (WER) of 6.23% on the test set, yielding a 48.15% relative improvement over the fixed-chunk baseline, while significantly reducing recognition latency and maintaining performance close to global decoding.

machine learning, natural language, recognition, (16 more...)

2511.09085

Country: Asia > China > Tibet Autonomous Region (0.46)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.89)

arXiv.org Artificial IntelligenceOct-17-2025

Joint Modeling of Big Five and HEXACO for Multimodal Apparent Personality-trait Recognition

Masumura, Ryo, Orihashi, Shota, Ihori, Mana, Tanaka, Tomohiro, Makishima, Naoki, Yamane, Taiga, Kawata, Naotaka, Suzuki, Satoshi, Katayama, Taichi

This paper proposes a joint modeling method of the Big Five, which has long been studied, and HEXACO, which has recently attracted attention in psychology, for automatically recognizing apparent personality traits from multimodal human behavior. Most previous studies have used the Big Five for multimodal apparent personality-trait recognition. However, no study has focused on apparent HEXACO which can evaluate an Honesty-Humility trait related to displaced aggression and vengefulness, social-dominance orientation, etc. In addition, the relationships between the Big Five and HEXACO when modeled by machine learning have not been clarified. We expect awareness of multimodal human behavior to improve by considering these relationships. The key advance of our proposed method is to optimize jointly recognizing the Big Five and HEXACO. Experiments using a self-introduction video dataset demonstrate that the proposed method can effectively recognize the Big Five and HEXACO.

artificial intelligence, hexaco, machine learning, (18 more...)

2510.14203

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Wang, Zhao-Yang, Shao, Zhimin, Chen, Jieneng, Chellappa, Rama

Combo-Gait: Unified Transformer Framework for Multi-Modal Gait Recognition and Attribute Analysis

arXiv.org Artificial IntelligenceOct-14-2025

Abstract-- Gait recognition is an important biometric for human identification at a distance, particularly under low-resolution or unconstrained environments. Current works typically focus on either 2D representations (e.g., silhouettes and skeletons) or 3D representations (e.g., meshes and SMPLs), but relying on a single modality often fails to capture the full geometric and dynamic complexity of human walking patterns. In this paper, we propose a multi-modal and multi-task framework that combines 2D temporal silhouettes with 3D SMPL features for robust gait analysis. Beyond identification, we introduce a multitask learning strategy that jointly performs gait recognition and human attribute estimation, including age, body mass index (BMI), and gender . A unified transformer is employed to effectively fuse multi-modal gait features and better learn attribute-related representations, while preserving discriminative identity cues. Extensive experiments on the large-scale BRIAR datasets, collected under challenging conditions such as long-range distances (up to 1 km) and extreme pitch angles (up to 50), demonstrate that our approach outperforms state-of-the-art methods in gait recognition and provides accurate human attribute estimation.

gait recognition, machine learning, pattern recognition, (19 more...)

2510.10417

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsAug-18-2025, 05:09:00 GMT

e360367584297ee8d2d5afa709cd440e-Paper.pdf

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
(2 more...)

Gogoi, Parismita, Kalita, Sishir, Lalhminghlui, Wendy, Terhiija, Viyazonuo, Tzudir, Moakala, Sarmah, Priyankoo, Prasanna, S. R. M.

Tone recognition in low-resource languages of North-East India: peeling the layers of SSL-based speech models

arXiv.org Artificial IntelligenceJun-5-2025

This study explores the use of self-supervised learning (SSL) models for tone recognition in three low-resource languages from North Eastern India: Angami, Ao, and Mizo. We evaluate four Wav2vec2.0 base models that were pre-trained on both tonal and non-tonal languages. We analyze tone-wise performance across the layers for all three languages and compare the different models. Our results show that tone recognition works best for Mizo and worst for Angami. The middle layers of the SSL models are the most important for tone recognition, regardless of the pre-training language, i.e. tonal or non-tonal. We have also found that the tone inventory, tone types, and dialectal variations affect tone recognition. These findings provide useful insights into the strengths and weaknesses of SSL-based embeddings for tonal languages and highlight the potential for improving tone recognition in low-resource settings. The source code is available at GitHub 1 .

machine learning, natural language, recognition, (21 more...)

2506.03606

Country:

Europe (1.00)
Asia > India > Nagaland (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)